Overview

Dataset statistics

Number of variables: 11
Number of observations: 39637
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 10
Duplicate rows (%): < 0.1%
Total size in memory: 4.6 MiB
Average record size in memory: 122.7 B

Variable types

Numeric: 9
Categorical: 2

Alerts

Dataset has 10 (< 0.1%) duplicate rows [Duplicates]
salinity is highly overall correlated with ORP and 6 other fields [High correlation]
turbidity is highly overall correlated with Pressure in and 1 other fields [High correlation]
ORP is highly overall correlated with salinity and 6 other fields [High correlation]
TDS is highly overall correlated with salinity and 6 other fields [High correlation]
Pressure in is highly overall correlated with salinity and 6 other fields [High correlation]
Pressure out is highly overall correlated with salinity and 6 other fields [High correlation]
Human Counter is highly overall correlated with ORP [High correlation]
temperature is highly overall correlated with salinity and 4 other fields [High correlation]
PH is highly overall correlated with salinity and 5 other fields [High correlation]
pump current is highly overall correlated with salinity and 5 other fields [High correlation]
turbidity is highly skewed (γ1 = -55.7047875) [Skewed]
ORP is highly skewed (γ1 = 20.66558326) [Skewed]
Pressure in is highly skewed (γ1 = -58.60490481) [Skewed]
Pressure out is highly skewed (γ1 = 47.0549401) [Skewed]
pump current is highly skewed (γ1 = 60.00157855) [Skewed]
pump current has 37477 (94.6%) zeros [Zeros]
Human Counter has 8333 (21.0%) zeros [Zeros]
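
The Skewed and Zeros alerts rest on two simple per-column quantities. The sketch below shows one way to compute them in plain Python. It assumes the skewness is a bias-corrected (adjusted Fisher-Pearson) estimate, which is an assumption about how γ1 was obtained rather than something this report states; the function names are illustrative only.

```python
import math


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (assumed estimator, see lead-in).

    Needs at least 3 values and a non-constant column.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


# Zero share behind the "pump current" alert: 37477 zeros in 39637 rows.
print(f"{37477 / 39637:.1%}")  # 94.6%
```

A single extreme value in an otherwise tight column is enough to push the skewness far from zero, which matches the pattern of the alerts above.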

Reproduction

Analysis started: 2022-12-23 21:05:02.019627
Analysis finished: 2022-12-23 21:05:17.043871
Duration: 15.02 seconds
Software version: pandas-profiling v3.5.0
Download configuration: config.json

Variables

salinity
Real number (ℝ)

Distinct: 1642
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 277.10764
Minimum: 0
Maximum: 557.575
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 619.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 268.422
Q1: 273.449
Median: 276.632
Q3: 280.556
95-th percentile: 283.233
Maximum: 557.575
Range: 557.575
Interquartile range (IQR): 7.107

Descriptive statistics

Standard deviation: 8.0206101
Coefficient of variation (CV): 0.028944024
Kurtosis: 494.22867
Mean: 277.10764
Median Absolute Deviation (MAD): 3.599
Skewness: 11.31336
Sum: 10983715
Variance: 64.330187
Monotonicity: Not monotonic
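
Several of these figures are derived from the others by simple arithmetic. As a quick sanity check, the sketch below recomputes the coefficient of variation, IQR, range and variance from the salinity values reported in this section; the variable names are illustrative.

```python
# Values copied from the salinity statistics in this section.
mean, std = 277.10764, 8.0206101
q1, q3 = 273.449, 280.556
minimum, maximum = 0.0, 557.575

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance is the squared standard deviation

print(f"CV={cv:.6f}  IQR={iqr:.3f}  range={value_range:.3f}  variance={variance:.3f}")
```

The results agree with the reported CV, IQR, range and variance up to rounding of the displayed inputs.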
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
274.281 | 142 | 0.4%
280.773 | 136 | 0.3%
274.444 | 134 | 0.3%
274.389 | 133 | 0.3%
274.335 | 132 | 0.3%
274.317 | 129 | 0.3%
280.719 | 119 | 0.3%
274.353 | 119 | 0.3%
274.462 | 119 | 0.3%
280.755 | 118 | 0.3%
Other values (1632) | 38356 | 96.8%
Minimum 10 values
Value | Count | Frequency (%)
0 | 2 | < 0.1%
259.85 | 1 | < 0.1%
264.552 | 1 | < 0.1%
264.57 | 3 | < 0.1%
264.588 | 5 | < 0.1%
264.606 | 7 | < 0.1%
264.624 | 24 | 0.1%
264.642 | 16 | < 0.1%
264.66 | 29 | 0.1%
264.678 | 38 | 0.1%
Maximum 10 values
Value | Count | Frequency (%)
557.575 | 11 | < 0.1%
335.063 | 1 | < 0.1%
335.009 | 1 | < 0.1%
334.882 | 1 | < 0.1%
334.792 | 1 | < 0.1%
334.683 | 1 | < 0.1%
334.629 | 5 | < 0.1%
334.484 | 1 | < 0.1%
334.376 | 1 | < 0.1%
334.34 | 1 | < 0.1%

turbidity
Real number (ℝ)

HIGH CORRELATION
SKEWED

Distinct: 712
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.407751
Minimum: -4375.26
Maximum: 46.9456
Zeros: 2
Zeros (%): < 0.1%
Negative: 2837
Negative (%): 7.2%
Memory size: 619.3 KiB

Quantile statistics

Minimum: -4375.26
5-th percentile: -13.74764
Q1: 25.0007
Median: 28.4492
Q3: 33.4651
95-th percentile: 36.6001
Maximum: 46.9456
Range: 4422.2056
Interquartile range (IQR): 8.4644

Descriptive statistics

Standard deviation: 75.140826
Coefficient of variation (CV): 3.210083
Kurtosis: 3256.5161
Mean: 23.407751
Median Absolute Deviation (MAD): 4.7024
Skewness: -55.704788
Sum: 927813.04
Variance: 5646.1437
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
33.4651 | 1108 | 2.8%
33.3083 | 823 | 2.1%
24.6873 | 799 | 2.0%
33.6218 | 761 | 1.9%
25.9412 | 714 | 1.8%
33.1516 | 714 | 1.8%
33.7786 | 684 | 1.7%
34.7192 | 674 | 1.7%
24.844 | 619 | 1.6%
25.6277 | 601 | 1.5%
Other values (702) | 32140 | 81.1%
Minimum 10 values
Value | Count | Frequency (%)
-4375.26 | 11 | < 0.1%
-66.0706 | 1 | < 0.1%
-65.7571 | 1 | < 0.1%
-65.6003 | 2 | < 0.1%
-65.2869 | 2 | < 0.1%
-65.1301 | 2 | < 0.1%
-64.9734 | 1 | < 0.1%
-64.8167 | 2 | < 0.1%
-64.6599 | 1 | < 0.1%
-64.5032 | 1 | < 0.1%
Maximum 10 values
Value | Count | Frequency (%)
46.9456 | 1 | < 0.1%
46.4753 | 2 | < 0.1%
46.3186 | 1 | < 0.1%
46.0051 | 1 | < 0.1%
45.2214 | 1 | < 0.1%
45.0645 | 3 | < 0.1%
44.9077 | 1 | < 0.1%
43.8105 | 1 | < 0.1%
43.4971 | 1 | < 0.1%
43.0269 | 1 | < 0.1%

ORP
Real number (ℝ)

HIGH CORRELATION
SKEWED

Distinct: 1182
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 742.17061
Minimum: 0
Maximum: 3002.87
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 619.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 658.037
Q1: 740.864
Median: 756.96
Q3: 763.289
95-th percentile: 776.943
Maximum: 3002.87
Range: 3002.87
Interquartile range (IQR): 22.425

Descriptive statistics

Standard deviation: 53.089104
Coefficient of variation (CV): 0.07153221
Kurtosis: 913.96099
Mean: 742.17061
Median Absolute Deviation (MAD): 12.84
Skewness: 20.665583
Sum: 29417416
Variance: 2818.453
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
761.029 | 219 | 0.6%
761.209 | 203 | 0.5%
760.667 | 197 | 0.5%
760.938 | 194 | 0.5%
760.577 | 193 | 0.5%
760.848 | 191 | 0.5%
740.684 | 190 | 0.5%
761.481 | 187 | 0.5%
761.119 | 186 | 0.5%
761.39 | 184 | 0.5%
Other values (1172) | 37693 | 95.1%
Minimum 10 values
Value | Count | Frequency (%)
0 | 2 | < 0.1%
56.4569 | 1 | < 0.1%
121.471 | 1 | < 0.1%
299.874 | 1 | < 0.1%
341.468 | 1 | < 0.1%
487.501 | 1 | < 0.1%
490.123 | 1 | < 0.1%
492.745 | 1 | < 0.1%
494.011 | 1 | < 0.1%
497.176 | 1 | < 0.1%
Maximum 10 values
Value | Count | Frequency (%)
3002.87 | 11 | < 0.1%
781.826 | 1 | < 0.1%
781.735 | 1 | < 0.1%
781.012 | 3 | < 0.1%
780.922 | 1 | < 0.1%
780.741 | 6 | < 0.1%
780.65 | 1 | < 0.1%
780.56 | 4 | < 0.1%
780.469 | 2 | < 0.1%
780.289 | 4 | < 0.1%

PH
Real number (ℝ)

Distinct: 1468
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.2382351
Minimum: -0.298727
Maximum: 20.7401
Zeros: 2
Zeros (%): < 0.1%
Negative: 2
Negative (%): < 0.1%
Memory size: 619.3 KiB

Quantile statistics

Minimum: -0.298727
5-th percentile: 0.0974754
Q1: 7.38661
Median: 7.4113
Q3: 7.45244
95-th percentile: 7.50624
Maximum: 20.7401
Range: 21.038827
Interquartile range (IQR): 0.06583

Descriptive statistics

Standard deviation: 2.7144671
Coefficient of variation (CV): 0.43513383
Kurtosis: 1.5034913
Mean: 6.2382351
Median Absolute Deviation (MAD): 0.03544
Skewness: -1.757885
Sum: 247264.92
Variance: 7.3683317
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0.0974754 | 1531 | 3.9%
0.0981084 | 1513 | 3.8%
0.0987413 | 1215 | 3.1%
0.0968425 | 742 | 1.9%
0.0993743 | 597 | 1.5%
7.45497 | 380 | 1.0%
7.40054 | 299 | 0.8%
7.39737 | 294 | 0.7%
7.3999 | 293 | 0.7%
7.4056 | 290 | 0.7%
Other values (1458) | 32483 | 82.0%
Minimum 10 values
Value | Count | Frequency (%)
-0.298727 | 2 | < 0.1%
0 | 2 | < 0.1%
0.0955765 | 2 | < 0.1%
0.0962095 | 189 | 0.5%
0.0968425 | 742 | 1.9%
0.0974754 | 1531 | 3.9%
0.0981084 | 1513 | 3.8%
0.0987413 | 1215 | 3.1%
0.0993743 | 597 | 1.5%
0.100007 | 227 | 0.6%
Maximum 10 values
Value | Count | Frequency (%)
20.7401 | 11 | < 0.1%
13.9561 | 1 | < 0.1%
12.1724 | 1 | < 0.1%
11.8553 | 1 | < 0.1%
11.5046 | 1 | < 0.1%
10.5875 | 1 | < 0.1%
10.5115 | 1 | < 0.1%
9.9349 | 1 | < 0.1%
9.85324 | 1 | < 0.1%
9.50575 | 1 | < 0.1%

TDS
Real number (ℝ)

Distinct: 1640
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 277.10733
Minimum: 0
Maximum: 557.575
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 619.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 268.422
Q1: 273.431
Median: 276.632
Q3: 280.574
95-th percentile: 283.233
Maximum: 557.575
Range: 557.575
Interquartile range (IQR): 7.143

Descriptive statistics

Standard deviation: 8.0204596
Coefficient of variation (CV): 0.028943513
Kurtosis: 494.26481
Mean: 277.10733
Median Absolute Deviation (MAD): 3.599
Skewness: 11.313758
Sum: 10983703
Variance: 64.327773
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
274.389 | 133 | 0.3%
274.281 | 131 | 0.3%
280.773 | 130 | 0.3%
274.317 | 129 | 0.3%
274.444 | 128 | 0.3%
274.462 | 127 | 0.3%
274.353 | 126 | 0.3%
274.335 | 125 | 0.3%
274.245 | 123 | 0.3%
280.719 | 120 | 0.3%
Other values (1630) | 38365 | 96.8%
Minimum 10 values
Value | Count | Frequency (%)
0 | 2 | < 0.1%
259.85 | 1 | < 0.1%
264.552 | 1 | < 0.1%
264.57 | 4 | < 0.1%
264.588 | 4 | < 0.1%
264.606 | 7 | < 0.1%
264.624 | 22 | 0.1%
264.642 | 17 | < 0.1%
264.66 | 33 | 0.1%
264.678 | 36 | 0.1%
Maximum 10 values
Value | Count | Frequency (%)
557.575 | 11 | < 0.1%
335.063 | 1 | < 0.1%
335.009 | 1 | < 0.1%
334.882 | 1 | < 0.1%
334.792 | 1 | < 0.1%
334.683 | 1 | < 0.1%
334.629 | 5 | < 0.1%
334.484 | 1 | < 0.1%
334.376 | 1 | < 0.1%
334.104 | 1 | < 0.1%

Pressure in
Real number (ℝ)

HIGH CORRELATION
SKEWED

Distinct: 260
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.5166765
Minimum: -5.92575
Maximum: 2.54503
Zeros: 2
Zeros (%): < 0.1%
Negative: 11
Negative (%): < 0.1%
Memory size: 619.3 KiB

Quantile statistics

Minimum: -5.92575
5-th percentile: 2.50741
Q1: 2.51212
Median: 2.52044
Q3: 2.52351
95-th percentile: 2.53165
Maximum: 2.54503
Range: 8.47078
Interquartile range (IQR): 0.01139

Descriptive statistics

Standard deviation: 0.14200144
Coefficient of variation (CV): 0.056424191
Kurtosis: 3469.8809
Mean: 2.5166765
Median Absolute Deviation (MAD): 0.00687
Skewness: -58.604905
Sum: 99753.507
Variance: 0.020164408
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
2.52351 | 2036 | 5.1%
2.52333 | 1517 | 3.8%
2.51212 | 1449 | 3.7%
2.5123 | 1413 | 3.6%
2.52206 | 1300 | 3.3%
2.52315 | 1287 | 3.2%
2.51248 | 1238 | 3.1%
2.51989 | 1154 | 2.9%
2.52224 | 1133 | 2.9%
2.52007 | 1131 | 2.9%
Other values (250) | 25979 | 65.5%
Minimum 10 values
Value | Count | Frequency (%)
-5.92575 | 11 | < 0.1%
0 | 2 | < 0.1%
2.47106 | 1 | < 0.1%
2.47848 | 1 | < 0.1%
2.48011 | 1 | < 0.1%
2.48029 | 1 | < 0.1%
2.48119 | 1 | < 0.1%
2.48282 | 1 | < 0.1%
2.483 | 2 | < 0.1%
2.48535 | 1 | < 0.1%
Maximum 10 values
Value | Count | Frequency (%)
2.54503 | 1 | < 0.1%
2.54485 | 1 | < 0.1%
2.54467 | 6 | < 0.1%
2.54449 | 2 | < 0.1%
2.54322 | 1 | < 0.1%
2.54286 | 1 | < 0.1%
2.54232 | 6 | < 0.1%
2.54214 | 7 | < 0.1%
2.54196 | 14 | < 0.1%
2.54178 | 19 | < 0.1%

Pressure out
Real number (ℝ)

HIGH CORRELATION
SKEWED

Distinct: 343
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.5187015
Minimum: 0
Maximum: 5.92575
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 619.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2.50615
Q1: 2.51085
Median: 2.51935
Q3: 2.52188
95-th percentile: 2.53038
Maximum: 5.92575
Range: 5.92575
Interquartile range (IQR): 0.01103

Descriptive statistics

Standard deviation: 0.060012078
Coefficient of variation (CV): 0.023826595
Kurtosis: 3037.1368
Mean: 2.5187015
Median Absolute Deviation (MAD): 0.00669
Skewness: 47.05494
Sum: 99833.77
Variance: 0.0036014495
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
2.52188 | 1914 | 4.8%
2.51085 | 1791 | 4.5%
2.52206 | 1689 | 4.3%
2.51103 | 1548 | 3.9%
2.5208 | 1403 | 3.5%
2.51899 | 1293 | 3.3%
2.52224 | 1256 | 3.2%
2.5217 | 1185 | 3.0%
2.52062 | 1159 | 2.9%
2.51121 | 1152 | 2.9%
Other values (333) | 25247 | 63.7%
Minimum 10 values
Value | Count | Frequency (%)
0 | 2 | < 0.1%
2.46727 | 1 | < 0.1%
2.47143 | 1 | < 0.1%
2.47305 | 1 | < 0.1%
2.4745 | 1 | < 0.1%
2.47902 | 1 | < 0.1%
2.47938 | 1 | < 0.1%
2.47956 | 3 | < 0.1%
2.47975 | 1 | < 0.1%
2.48155 | 1 | < 0.1%
Maximum 10 values
Value | Count | Frequency (%)
5.92575 | 11 | < 0.1%
2.5633 | 1 | < 0.1%
2.56311 | 2 | < 0.1%
2.56275 | 1 | < 0.1%
2.56221 | 1 | < 0.1%
2.56185 | 1 | < 0.1%
2.55968 | 1 | < 0.1%
2.5595 | 1 | < 0.1%
2.55896 | 1 | < 0.1%
2.55877 | 1 | < 0.1%

pump current
Real number (ℝ)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 464
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.868839
Minimum: 0
Maximum: 56183.1
Zeros: 37477
Zeros (%): 94.6%
Negative: 0
Negative (%): 0.0%
Memory size: 619.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 5.233 × 10^-5
Maximum: 56183.1
Range: 56183.1
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 935.84463
Coefficient of variation (CV): 58.973728
Kurtosis: 3598.5184
Mean: 15.868839
Median Absolute Deviation (MAD): 0
Skewness: 60.001579
Sum: 628993.19
Variance: 875805.17
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0 | 37477 | 94.6%
5.233 × 10^-5 | 1482 | 3.7%
0.00020931 | 43 | 0.1%
56183.1 | 11 | < 0.1%
0.0102562 | 5 | < 0.1%
0.00523278 | 5 | < 0.1%
1.27345 | 4 | < 0.1%
6.30074 | 4 | < 0.1%
5.2916 | 4 | < 0.1%
0.152588 | 4 | < 0.1%
Other values (454) | 598 | 1.5%
Minimum 10 values
Value | Count | Frequency (%)
0 | 37477 | 94.6%
5.233 × 10^-5 | 1482 | 3.7%
0.00020931 | 43 | 0.1%
0.00047095 | 2 | < 0.1%
0.00083725 | 3 | < 0.1%
0.0018838 | 3 | < 0.1%
0.00523278 | 5 | < 0.1%
0.00633166 | 1 | < 0.1%
0.00753521 | 2 | < 0.1%
0.0088434 | 1 | < 0.1%
Maximum 10 values
Value | Count | Frequency (%)
56183.1 | 11 | < 0.1%
837.245 | 1 | < 0.1%
306.959 | 1 | < 0.1%
252.576 | 1 | < 0.1%
219.693 | 1 | < 0.1%
182.398 | 1 | < 0.1%
178.512 | 1 | < 0.1%
174.093 | 1 | < 0.1%
173.14 | 1 | < 0.1%
164.865 | 1 | < 0.1%

Human Counter
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.8640412
Minimum: 0
Maximum: 15
Zeros: 8333
Zeros (%): 21.0%
Negative: 0
Negative (%): 0.0%
Memory size: 619.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3
Median: 7
Q3: 15
95-th percentile: 15
Maximum: 15
Range: 15
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 5.5183122
Coefficient of variation (CV): 0.70171456
Kurtosis: -1.2692098
Mean: 7.8640412
Median Absolute Deviation (MAD): 7
Skewness: -0.024250063
Sum: 311707
Variance: 30.451769
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Common values
Value | Count | Frequency (%)
15 | 11569 | 29.2%
6 | 9311 | 23.5%
0 | 8333 | 21.0%
9 | 5364 | 13.5%
10 | 2261 | 5.7%
3 | 1456 | 3.7%
7 | 858 | 2.2%
1 | 310 | 0.8%
4 | 122 | 0.3%
2 | 37 | 0.1%
Other values (3) | 16 | < 0.1%
Minimum 10 values
Value | Count | Frequency (%)
0 | 8333 | 21.0%
1 | 310 | 0.8%
2 | 37 | 0.1%
3 | 1456 | 3.7%
4 | 122 | 0.3%
6 | 9311 | 23.5%
7 | 858 | 2.2%
8 | 1 | < 0.1%
9 | 5364 | 13.5%
10 | 2261 | 5.7%
Maximum 10 values
Value | Count | Frequency (%)
15 | 11569 | 29.2%
12 | 1 | < 0.1%
11 | 14 | < 0.1%
10 | 2261 | 5.7%
9 | 5364 | 13.5%
8 | 1 | < 0.1%
7 | 858 | 2.2%
6 | 9311 | 23.5%
4 | 122 | 0.3%
3 | 1456 | 3.7%

temperature
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 619.3 KiB

Value | Count
74.2574 | 39635
0.0 | 2

Length

Max length: 7
Median length: 7
Mean length: 6.9997982
Min length: 3

Characters and Unicode

Total characters: 277451
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 74.2574
2nd row: 74.2574
3rd row: 74.2574
4th row: 74.2574
5th row: 74.2574

Common Values

Value | Count | Frequency (%)
74.2574 | 39635 | > 99.9%
0.0 | 2 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
74.2574 | 39635 | > 99.9%
0.0 | 2 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
7 | 79270 | 28.6%
4 | 79270 | 28.6%
. | 39637 | 14.3%
2 | 39635 | 14.3%
5 | 39635 | 14.3%
0 | 4 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 237814 | 85.7%
Other Punctuation | 39637 | 14.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
7 | 79270 | 33.3%
4 | 79270 | 33.3%
2 | 39635 | 16.7%
5 | 39635 | 16.7%
0 | 4 | < 0.1%
Other Punctuation
Value | Count | Frequency (%)
. | 39637 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 277451 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
7 | 79270 | 28.6%
4 | 79270 | 28.6%
. | 39637 | 14.3%
2 | 39635 | 14.3%
5 | 39635 | 14.3%
0 | 4 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 277451 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
7 | 79270 | 28.6%
4 | 79270 | 28.6%
. | 39637 | 14.3%
2 | 39635 | 14.3%
5 | 39635 | 14.3%
0 | 4 | < 0.1%

water level
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 619.3 KiB

Value | Count
500.0 | 38835
800.0 | 802

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 198185
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 500.0
2nd row: 500.0
3rd row: 500.0
4th row: 500.0
5th row: 500.0

Common Values

Value | Count | Frequency (%)
500.0 | 38835 | 98.0%
800.0 | 802 | 2.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
500.0 | 38835 | 98.0%
800.0 | 802 | 2.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 118911 | 60.0%
. | 39637 | 20.0%
5 | 38835 | 19.6%
8 | 802 | 0.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 158548 | 80.0%
Other Punctuation | 39637 | 20.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 118911 | 75.0%
5 | 38835 | 24.5%
8 | 802 | 0.5%
Other Punctuation
Value | Count | Frequency (%)
. | 39637 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 198185 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 118911 | 60.0%
. | 39637 | 20.0%
5 | 38835 | 19.6%
8 | 802 | 0.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 198185 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 118911 | 60.0%
. | 39637 | 20.0%
5 | 38835 | 19.6%
8 | 802 | 0.4%

Interactions

[Figure: pairwise interaction plots between variables (81 images); not reproduced in this text version.]

Correlations


Auto

The auto setting is an interpretable pairwise column metric of the following mapping:
  • Variable_type-Variable_type : Method, Range
  • Categorical-Categorical : Cramer's V, [0,1]
  • Numerical-Categorical : Cramer's V, [0,1] (using a discretized numerical column)
  • Numerical-Numerical : Spearman's ρ, [-1,1]
The number of bins used in the discretization for the Numerical-Categorical column pair can be changed using config.correlations["auto"].n_bins. The number of bins affects the granularity of the association you wish to measure.

This configuration uses the recommended metric for each pair of columns.
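
The mapping above can be expressed as a small lookup table. The sketch below is illustrative only: the names `AUTO_METHODS`, `auto_method` and `discretize` are hypothetical, and equal-width binning is an assumed discretization strategy, since the text only says the numerical column is discretized into `n_bins` bins.

```python
# (column type, column type) -> (method, value range), copied from the list above.
AUTO_METHODS = {
    ("Categorical", "Categorical"): ("Cramer's V", (0, 1)),
    ("Numerical", "Categorical"): ("Cramer's V", (0, 1)),  # numerical side is discretized first
    ("Numerical", "Numerical"): ("Spearman's rho", (-1, 1)),
}


def auto_method(type_a, type_b):
    """Look up the metric for a pair of column types, ignoring order."""
    key = (type_a, type_b)
    if key not in AUTO_METHODS:
        key = (type_b, type_a)
    return AUTO_METHODS[key]


def discretize(values, n_bins):
    """Assign each value to one of n_bins equal-width bins (assumed strategy)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum falls exactly on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


print(auto_method("Categorical", "Numerical"))
print(discretize([0, 1, 2, 3], n_bins=2))
```

A larger `n_bins` resolves finer structure in the numerical column but leaves fewer observations per cell of the contingency table.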

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
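
A minimal pure-Python sketch of that recipe: rank both variables (ties share the average rank), then apply the covariance-over-standard-deviations formula to the ranks. The function names are illustrative and are not taken from the profiling library.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: rho is (numerically) 1, Pearson's r is lower.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(spearman_rho(x, y), pearson(x, y))
```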

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
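
The definition and the invariance property can both be checked with a few lines of Python. This is a sketch with made-up sample values; note that the invariance holds for positive scale factors, while a negative factor flips the sign of r.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sample values, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

r = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_transformed)  # the two values agree up to floating-point error
```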

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
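
The formula as worded here, with no tie correction, is the variant usually called tau-a. The sketch below implements exactly that; statistical libraries commonly default to the tie-corrected tau-b, so results can differ on data with many repeated values. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


print(kendall_tau_a([1, 2, 3], [1, 3, 2]))  # 2 concordant pairs, 1 discordant, 3 in total
```

The pairwise loop is quadratic in the number of observations, which is fine for illustration but slow for a column of this dataset's size.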

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
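
A sketch of the bias-corrected estimator in plain Python, following the commonly cited form of Bergsma's correction: the χ²-based φ² and the table dimensions are each shrunk by a term of order 1/(n-1). It is an illustration of the formula, not the report's own implementation, and the function name is hypothetical.

```python
import math
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row_value, row_count in row_totals.items():
        for col_value, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row_value, col_value), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:  # a constant column carries no association information
        return 0.0
    return math.sqrt(phi2_corr / denom)


# Perfectly associated labels give a value of 1 (up to floating-point error).
a = ["x"] * 50 + ["y"] * 50
b = ["p"] * 50 + ["q"] * 50
print(cramers_v_corrected(a, b))
```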

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
time_stamp | salinity | turbidity | ORP | PH | TDS | Pressure in | Pressure out | pump current | Human Counter | temperature | water level
2022-11-01 11:01:16 | 287.18 | 9.51 | 774.05 | 7.42 | 287.18 | 2.54 | 2.53 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-01 11:01:26 | 287.18 | 9.36 | 773.60 | 7.42 | 287.18 | 2.54 | 2.53 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-01 11:01:36 | 287.12 | 9.36 | 773.87 | 7.42 | 287.12 | 2.54 | 2.53 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-01 11:01:46 | 287.25 | 8.57 | 773.96 | 7.41 | 287.25 | 2.54 | 2.53 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-01 11:01:57 | 287.18 | 8.89 | 773.60 | 7.42 | 287.16 | 2.54 | 2.53 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-01 11:02:07 | 287.18 | 8.57 | 774.14 | 7.41 | 287.18 | 2.54 | 2.53 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-01 11:02:17 | 287.21 | 9.04 | 774.14 | 7.41 | 287.21 | 2.54 | 2.53 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-01 11:02:27 | 287.21 | 8.89 | 773.78 | 7.41 | 287.21 | 2.54 | 2.53 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-01 11:02:37 | 287.16 | 8.73 | 773.69 | 7.41 | 287.16 | 2.54 | 2.53 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-01 11:02:46 | 287.23 | 8.73 | 773.96 | 7.41 | 287.16 | 2.54 | 2.53 | 0.00 | 0.00 | 74.26 | 500.00

Last rows
time_stamp | salinity | turbidity | ORP | PH | TDS | Pressure in | Pressure out | pump current | Human Counter | temperature | water level
2022-11-07 18:58:35 | 288.13 | 32.99 | 722.60 | 7.59 | 288.13 | 2.52 | 2.51 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-07 19:58:45 | 287.39 | 32.68 | 721.42 | 7.59 | 287.39 | 2.52 | 2.51 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-07 19:58:55 | 287.46 | 32.84 | 721.33 | 7.59 | 287.46 | 2.52 | 2.51 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-07 20:59:05 | 286.72 | 32.05 | 718.53 | 7.59 | 286.72 | 2.52 | 2.51 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-07 20:59:15 | 286.74 | 31.74 | 718.62 | 7.58 | 286.74 | 2.52 | 2.51 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-07 21:59:25 | 286.05 | 31.27 | 719.62 | 7.60 | 286.05 | 2.52 | 2.51 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-07 21:59:35 | 286.02 | 31.11 | 719.43 | 7.59 | 286.02 | 2.52 | 2.51 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-07 22:59:44 | 285.29 | 31.27 | 717.08 | 7.59 | 285.29 | 2.52 | 2.52 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-07 22:59:54 | 285.26 | 30.96 | 717.36 | 7.59 | 285.26 | 2.52 | 2.51 | 0.00 | 0.00 | 74.26 | 500.00
2022-11-07 23:59:54 | 284.66 | 31.11 | 715.18 | 7.58 | 284.66 | 2.52 | 2.51 | 0.00 | 0.00 | 74.26 | 500.00

Duplicate rows

Most frequently occurring

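index | salinity | turbidity | ORP | PH | TDS | Pressure in | Pressure out | pump current | Human Counter | temperature | water level | # duplicates
9 | 557.58 | -4375.26 | 3002.87 | 20.74 | 557.58 | -5.93 | 5.93 | 56183.10 | 0.00 | 74.26 | 500.00 | 11
0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 500.00 | 2
1 | 274.23 | 32.37 | 777.49 | 7.43 | 274.23 | 2.53 | 2.53 | 0.00 | 0.00 | 74.26 | 500.00 | 2
2 | 274.23 | 32.37 | 777.67 | 7.44 | 274.23 | 2.53 | 2.53 | 0.00 | 0.00 | 74.26 | 500.00 | 2
3 | 275.62 | 30.64 | 743.67 | 7.44 | 275.62 | 2.51 | 2.51 | 0.00 | 10.00 | 74.26 | 500.00 | 2
4 | 332.50 | 14.97 | 677.57 | 5.35 | 332.50 | 2.54 | 2.54 | 0.00 | 0.00 | 74.26 | 500.00 | 2
5 | 332.58 | 14.50 | 677.75 | 5.35 | 332.58 | 2.54 | 2.54 | 0.00 | 0.00 | 74.26 | 500.00 | 2
6 | 333.18 | 14.81 | 677.66 | 5.35 | 333.18 | 2.54 | 2.54 | 0.00 | 0.00 | 74.26 | 500.00 | 2
7 | 333.18 | 14.81 | 677.66 | 5.35 | 333.18 | 2.54 | 2.54 | 0.00 | 0.00 | 74.26 | 500.00 | 2
8 | 333.80 | 14.97 | 677.57 | 5.34 | 333.80 | 2.54 | 2.54 | 0.00 | 0.00 | 74.26 | 500.00 | 2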
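
A table like this one can be approximated by counting how often each complete row occurs and keeping those seen more than once. The sketch below uses only the standard library and made-up rows; `duplicate_summary` is a hypothetical helper, not part of the profiling tool.

```python
from collections import Counter


def duplicate_summary(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(tuple(row) for row in rows)
    return {row: count for row, count in counts.items() if count > 1}


# Made-up two-column rows, for illustration only.
rows = [(1.0, 2.0), (1.0, 2.0), (3.0, 4.0), (1.0, 2.0), (5.0, 6.0), (5.0, 6.0)]
dupes = duplicate_summary(rows)

for row, count in sorted(dupes.items(), key=lambda item: -item[1]):
    print(row, count)
```

Rows are compared on their full-precision values, so two rows that look identical after rounding to two decimals may still be counted separately.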